Abstract: A major issue in financial economics is the behavior of asset returns over long horizons.
Various estimators of long range dependence have been proposed. Even though some have known
asymptotic properties, it is important to test their accuracy by using simulated series of different
lengths. We test R/S analysis, Detrended Fluctuation Analysis and periodogram regression
methods on samples drawn from Gaussian white noise. The DFA statistic turns out to be the
unanimous winner. Unfortunately, no asymptotic distribution theory has been derived for this
statistic so far. We were able, however, to construct empirical (i.e. approximate) confidence
intervals for all three methods. The obtained values differ substantially from heuristic values
proposed by some authors for the R/S statistic and are very close to asymptotic values for the
periodogram regression method.

1 Introduction

Time series with long-range dependence are widespread in nature and for many years have been
extensively studied in hydrology and geophysics. More recently, "long memory" or "1/f noise"
(as it is often called) has been observed in DNA sequences, cardiac dynamics, internet traffic,
meteorology, geology and even ethology.

In economics and finance, long-range dependence also has a long history (for a review see
Refs. 11, 0) and is still a hot topic of active research.

Historical records of economic and financial data typically exhibit nonperiodic cyclical patterns that
are indicative of the presence of significant long memory. However, the statistical investigations
that have been performed to test long-range dependence have often become a source of major
controversies, especially in the case of stock returns. The reason for this is the implications that
the presence of long memory has for many of the paradigms used in modern financial economics.

Various estimators of long range dependence have been proposed. Even though some have
known asymptotic properties, it is important to test their accuracy by using simulated series
of different lengths. Such studies have been presented using "ideal" models that
display long-range dependence, i.e. fractional Brownian noise (fBn) and Fractional ARIMA(0,d,0).
However, the authors tested estimators on rather long time series (10000 elements), whereas in
practice we often have to perform analysis of much shorter data sets. For example, Lux used
daily time series comprising 1949 DAX returns; Lobato and Savin originally performed tests on
8178 S&P500 daily returns, but later, due to the non-stationarity of the process (which influenced
the estimates), divided the sample into smaller subsamples; Grau-Carles analyzed various
stock index data including 4125 Nikkei and 1555 FTSE daily returns; Weron and Przybyłowicz
estimated the Hurst exponent using only 670 daily returns of the spot electricity price in
California. Moreover, Taqqu et al. based their statistical conclusions on only 50 simulated
trajectories of fBn or FARIMA. This may be enough for the estimation of the mean or median,
but certainly not enough for the estimation of very high or low quantiles, which can be used to
construct empirical confidence intervals.

In this paper we analyze rescaled range analysis, Detrended Fluctuation Analysis and
periodogram regression methods on samples drawn from Gaussian white noise. In Section 2 we
describe all three methods in detail and then, in Section 3, present a comparison based on Monte
Carlo simulations. The DFA statistic turns out to be the unanimous winner. Unfortunately, no
asymptotic distribution theory has been derived for this statistic so far. We were able, however,
to construct empirical (i.e. approximate) confidence intervals for all three methods. These results
are presented in Section 4. The obtained values differ substantially from heuristic values proposed
by some authors for the R/S statistic and are very close to asymptotic values for the periodogram
regression method. In Section 5 we apply the results of Section 4 to a number of financial data
sets, illustrating their usefulness.



2 Methods for estimating H 
2.1 R/S analysis 

We began our investigation with one of the oldest and best-known methods, the so-called R/S
analysis. This method, proposed by Mandelbrot and Wallis and based on the previous hydrological
analysis of Hurst, allows the calculation of the self-similarity parameter $H$, which measures the
intensity of long range dependence in a time series.

The analysis begins with dividing a time series (of returns) of length $L$ into $d$ subseries of length
$n$. Next, for each subseries $m = 1, ..., d$: 1° find the mean ($E_m$) and standard deviation ($S_m$);
2° normalize the data ($Z_{i,m}$) by subtracting the sample mean, $X_{i,m} = Z_{i,m} - E_m$ for
$i = 1, ..., n$; 3° create a cumulative time series $Y_{i,m} = \sum_{j=1}^{i} X_{j,m}$ for $i = 1, ..., n$;
4° find the range $R_m = \max\{Y_{1,m}, ..., Y_{n,m}\} - \min\{Y_{1,m}, ..., Y_{n,m}\}$; and 5° rescale
the range $R_m/S_m$. Finally, calculate the mean value of the rescaled range for all subseries of
length $n$

$$(R/S)_n = \frac{1}{d} \sum_{m=1}^{d} R_m/S_m.$$



It can be shown that the R/S statistic asymptotically follows the relation

$$(R/S)_n \sim c n^H.$$

Thus the value of $H$ can be obtained by running a simple linear regression over a sample of
increasing time horizons

$$\log (R/S)_n = \log c + H \log n.$$

Equivalently, we can plot the $(R/S)_n$ statistic against $n$ on double-logarithmic paper, see Fig.
1. If the returns process is white noise then the plot is roughly a straight line with slope 0.5. If
the process is persistent then the slope is greater than 0.5; if it is anti-persistent then the slope is
less than 0.5. The "significance" level is usually chosen to be one over the square root of sample
length, i.e. the standard deviation of a Gaussian white noise.
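
The five steps and the log-log regression described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: the function names (`rescaled_range`, `ols_slope`, `hurst_rs`) are hypothetical, the standard deviation uses the 1/n convention (an assumption, since the text does not specify it), and the subseries lengths are restricted to powers of two greater than `min_n`.

```python
import math
import random

def rescaled_range(x, n):
    """Mean rescaled range (R/S)_n over the d = len(x) // n subseries of length n."""
    d = len(x) // n
    total = 0.0
    for m in range(d):
        z = x[m * n:(m + 1) * n]
        mean = sum(z) / n                                    # step 1: mean E_m
        s = math.sqrt(sum((v - mean) ** 2 for v in z) / n)   # step 1: S_m (1/n convention assumed)
        y, lo, hi = 0.0, math.inf, -math.inf
        for v in z:
            y += v - mean                                    # steps 2-3: cumulate the demeaned data
            lo, hi = min(lo, y), max(hi, y)
        total += (hi - lo) / s                               # steps 4-5: range, then rescale
    return total / d

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx

def hurst_rs(x, min_n=50):
    """Classical R/S estimate of H: slope of log (R/S)_n against log n."""
    sizes = [2 ** k for k in range(1, len(x).bit_length()) if min_n < 2 ** k <= len(x)]
    return ols_slope([math.log(n) for n in sizes],
                     [math.log(rescaled_range(x, n)) for n in sizes])

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]   # Gaussian white noise, true H = 0.5
h = hurst_rs(noise)
```

For a white noise input the returned slope should lie in the vicinity of 0.5; the exact value depends on the random seed and on the chosen subseries lengths.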

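However, it should be noted that for small $n$ there is a significant deviation from the 0.5
slope. For this reason the theoretical (i.e. for white noise) values of the R/S statistic are usually
approximated by

$$
E(R/S)_n = \begin{cases}
\frac{n - \frac{1}{2}}{n} \, \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\, \Gamma\left(\frac{n}{2}\right)} \sum_{i=1}^{n-1} \sqrt{\frac{n-i}{i}} & \text{for } n \le 340, \\
\frac{n - \frac{1}{2}}{n} \, \frac{1}{\sqrt{n \pi / 2}} \sum_{i=1}^{n-1} \sqrt{\frac{n-i}{i}} & \text{for } n > 340,
\end{cases} \qquad (1)
$$

where $\Gamma$ is the Euler gamma function. This formula is a slight modification of the formula given
by Anis and Lloyd; the $(n - \frac{1}{2})/n$ term was added by Peters to improve the performance
for very small $n$.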
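
As an illustration, the benchmark values can be computed numerically. The sketch below assumes that formula (1) has the form usually quoted for the Anis-Lloyd expectation with the Peters factor, i.e. $(n - 1/2)/n$ times $\Gamma((n-1)/2) / (\sqrt{\pi}\,\Gamma(n/2))$ (replaced by $1/\sqrt{n\pi/2}$ for $n > 340$) times $\sum_{i=1}^{n-1} \sqrt{(n-i)/i}$. The function name `expected_rs` is hypothetical, and the gamma ratio is evaluated through `math.lgamma` to avoid overflow.

```python
import math

def expected_rs(n):
    """Approximate E(R/S)_n for white noise (assumed form of formula (1))."""
    tail = sum(math.sqrt((n - i) / i) for i in range(1, n))
    if n <= 340:
        # Gamma((n-1)/2) / (sqrt(pi) * Gamma(n/2)), computed on the log scale
        ratio = math.exp(math.lgamma((n - 1) / 2) - math.lgamma(n / 2)) / math.sqrt(math.pi)
    else:
        # large-n approximation of the same ratio
        ratio = 1.0 / math.sqrt(n * math.pi / 2)
    return (n - 0.5) / n * ratio * tail

benchmark = {n: expected_rs(n) for n in (64, 128, 256, 512)}
```

The two branches agree closely around $n = 340$, which is why the switch to the simpler large-$n$ expression does not introduce a visible jump in the benchmark.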

Figure 1: Estimation of the Hurst exponent for a white noise ($H = 0.5$) sequence of $2^{14} = 16384$
observations via the Hurst R/S analysis (top left panel), the Detrended Fluctuation Analysis (top
right panel) and the periodogram Geweke-Porter-Hudak method (bottom panel).



Formula (1) was used as a benchmark in all empirical studies of the R/S statistic presented in
this paper, i.e. the Hurst exponent $H$ was calculated as 0.5 plus the slope of $(R/S)_n - E(R/S)_n$.
The resulting statistic is denoted by R/S-AL.

A major drawback of the R/S analysis is the fact that no asymptotic distribution theory has
been derived for the Hurst parameter $H$. The only known results are for the rescaled (but not by
standard deviation) range $R_m$ itself.



2.2 Detrended Fluctuation Analysis 

The second method we used to measure long range dependence was the Detrended Fluctuation
Analysis (DFA) proposed by Peng et al. The method can be summarized as follows. Divide a
time series (of returns) of length $L$ into $d$ subseries of length $n$. Next, for each subseries
$m = 1, ..., d$: 1° create a cumulative time series $Y_{i,m} = \sum_{j=1}^{i} X_{j,m}$ for $i = 1, ..., n$;
2° fit a least squares line $\tilde{Y}_m(x) = a_m x + b_m$ to $\{Y_{1,m}, ..., Y_{n,m}\}$; and 3° calculate
the root mean square fluctuation (i.e. standard deviation) of the integrated and detrended time series

$$F(m) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(Y_{i,m} - a_m i - b_m\right)^2}.$$

Finally, calculate the mean value of the root mean square fluctuation for all subseries of length $n$

$$\bar{F}(n) = \frac{1}{d} \sum_{m=1}^{d} F(m).$$

As in the case of R/S analysis, a linear relationship on double-logarithmic paper of $\bar{F}(n)$ against
the interval size $n$ indicates the presence of power-law scaling of the form $c n^H$. If the
returns process is white noise then the slope is roughly 0.5. If the process is persistent then the
slope is greater than 0.5; if it is anti-persistent then the slope is less than 0.5.

Unfortunately, no asymptotic distribution theory has been derived for the DFA statistic so far.
Hence, no explicit hypothesis testing can be performed and the significance relies on subjective
assessment.
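
The DFA procedure summarized in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the simulations: the names `dfa_fluctuation` and `hurst_dfa` are hypothetical, the detrending line is fitted against the index $i = 1, ..., n$, and only subseries lengths that are powers of two greater than `min_n` are used.

```python
import math
import random

def dfa_fluctuation(x, n):
    """Mean root mean square fluctuation over the d = len(x) // n subseries of length n."""
    d = len(x) // n
    t = range(1, n + 1)
    t_mean = (n + 1) / 2
    t_ss = sum((ti - t_mean) ** 2 for ti in t)
    total = 0.0
    for m in range(d):
        y, acc = [], 0.0
        for v in x[m * n:(m + 1) * n]:
            acc += v
            y.append(acc)                                   # step 1: cumulative series
        y_mean = sum(y) / n
        a = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / t_ss
        b = y_mean - a * t_mean                             # step 2: least squares line a*i + b
        resid = sum((yi - a * ti - b) ** 2 for ti, yi in zip(t, y))
        total += math.sqrt(resid / n)                       # step 3: RMS of detrended series
    return total / d

def hurst_dfa(x, min_n=10):
    """DFA estimate of H: slope of log F(n) against log n."""
    sizes = [2 ** k for k in range(1, len(x).bit_length()) if min_n < 2 ** k <= len(x)]
    xs = [math.log(n) for n in sizes]
    ys = [math.log(dfa_fluctuation(x, n)) for n in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((p - mx) * (q - my) for p, q in zip(xs, ys)) / sum((p - mx) ** 2 for p in xs)

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]   # Gaussian white noise, true H = 0.5
h = hurst_dfa(noise)
```

A constant input produces an exactly linear cumulative series, so its fluctuation is zero up to rounding; this gives a quick sanity check of the detrending step.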



2.3 Periodogram regression 

The third method is a semi-parametric procedure to obtain an estimate of the fractional differencing
parameter $d$. This technique, proposed by Geweke and Porter-Hudak (GPH), is based on
observations of the slope of the spectral density function of a fractionally integrated series around
the angular frequency $\omega = 0$. Since they showed that the spectral density function of a general
fractionally integrated model (e.g. FARIMA) with differencing parameter $d$ is identical to that of
a fractional Gaussian noise with Hurst exponent $H = d + 0.5$, the GPH method can be used to
estimate $H$.

The estimation procedure begins with calculating the periodogram, which is a sample analogue
of the spectral density. For a vector of observations $\{x_1, ..., x_L\}$ the periodogram is defined as

$$I_L(\omega_k) = \frac{1}{L} \left| \sum_{t=1}^{L} x_t e^{-2\pi i (t-1) \omega_k} \right|^2,$$

where $\omega_k = k/L$, $k = 1, ..., [L/2]$ and $[x]$ denotes the largest integer less than or equal to $x$.
Observe that $I_L$ is the squared absolute value of the Fourier transform and if the observations
vector is of appropriate length (even or a power of 2) then we can use fast algorithms to calculate
the Fourier transform.

The next and final step is to run a simple linear regression

$$\log\{I_L(\omega_k)\} = a - \hat{d} \log\{4 \sin^2(\omega_k/2)\} \qquad (2)$$

at low Fourier frequencies $\omega_k$, $k = 1, ..., K \le [L/2]$. The least squares estimate of the slope
yields the differencing parameter $d$ through the relation $d = \hat{d}$, hence $H = \hat{d} + 0.5$. A major
issue in the application of this method is the choice of $K$. Geweke and Porter-Hudak, as well as a
number of other authors, recommend choosing $K$ such that $K = [L^{0.5}]$; however, other values (e.g.
smaller cutoffs bounded above by $[L^{0.5}]$) have also been suggested.

Periodogram regression is the only one of the presented methods that has known asymptotic
properties. Inference is based on the asymptotic distribution of the estimate $\hat{d}$ of $d$, which is
normally distributed with mean $d$ and variance

$$\mathrm{var}(\hat{d}) = \frac{\pi^2}{6 \sum_{k=1}^{K} (x_k - \bar{x})^2}, \qquad (3)$$

where $x_k = \log\{4 \sin^2(\omega_k/2)\}$ is the regressor in eq. (2).
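
A compact sketch of the log-periodogram regression is given below. It is an illustration only: the names `periodogram` and `hurst_gph` are hypothetical, the periodogram is evaluated by a direct sum rather than a fast Fourier transform, and the standard error assumes the usual GPH asymptotic variance $\pi^2 / (6 \sum_k (x_k - \bar{x})^2)$. The estimate of $d$ is taken as minus the least squares coefficient of the regressor, following the sign convention of eq. (2).

```python
import cmath
import math
import random

def periodogram(x, k):
    """I_L(omega_k) = |sum_t x_t exp(-2 pi i (t-1) omega_k)|^2 / L, with omega_k = k / L."""
    L = len(x)
    s = sum(xt * cmath.exp(-2j * math.pi * t * k / L) for t, xt in enumerate(x))
    return abs(s) ** 2 / L

def hurst_gph(x, power=0.5):
    """Return (H estimate, asymptotic standard error) from the log-periodogram regression."""
    L = len(x)
    K = math.floor(L ** power + 1e-9)                  # cutoff K = [L^power]
    reg = [math.log(4 * math.sin(k / L / 2) ** 2) for k in range(1, K + 1)]
    log_i = [math.log(periodogram(x, k)) for k in range(1, K + 1)]
    r_mean, i_mean = sum(reg) / K, sum(log_i) / K
    sxx = sum((r - r_mean) ** 2 for r in reg)
    coef = sum((r - r_mean) * (v - i_mean) for r, v in zip(reg, log_i)) / sxx
    d_hat = -coef                                      # eq. (2): log I = a - d * regressor
    std_err = math.sqrt(math.pi ** 2 / (6 * sxx))      # assumed asymptotic standard error
    return d_hat + 0.5, std_err

rng = random.Random(2)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]     # Gaussian white noise, true H = 0.5
h, se = hurst_gph(noise)
```

Because only $K = [L^{0.5}]$ frequencies enter the regression, the direct sum stays cheap here; for larger cutoffs a fast Fourier transform would be the natural replacement.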



3 Comparison of estimators 

In order to test the presented estimation methods we performed Monte Carlo simulations. We
generated samples of Gaussian white noise sequences (independent and $N(0, 1)$ distributed) of
length $L = 2^N$, where $N = 8, 9, ..., 16$, i.e. $L = 256, 512, ..., 65536$. For each $L$ ten thousand
trajectories were produced. Next, we applied all three estimation procedures - Anis-Lloyd corrected
R/S statistic, Detrended Fluctuation Analysis and periodogram regression - to the data series
and compared the results.

In Figure 2 we plotted the mean values (over all 10000 samples) of the estimated Hurst
exponents. For illustrative purposes we also included the results of the classical R/S statistic (R/S),
i.e. without the Anis-Lloyd correction. As can be seen in Fig. 2, on average, this method
overestimates the true Hurst exponent to a great extent. On the other hand, all other methods have a
slight negative bias.

Figure 2: Mean values (over 10000 samples) of the estimated Hurst exponents for Gaussian white
noise sequences of length $L = 256, 512, ..., 65536$. The methods used are the classical R/S analysis
(R/S) for subintervals of length n > 50, the Anis-Lloyd corrected R/S statistic (R/S-AL) for
n > 50, the Detrended Fluctuation Analysis for n > 10 and n > 50, and the periodogram
Geweke-Porter-Hudak method for cutoff $K = [L^{0.5}]$.



We analyzed the R/S and DFA methods for two cutoffs of the subinterval length: n > 10 and
n > 50. Despite the corrections to the original rescaled range statistic, R/S-AL still possesses
a large variance if very small n are included in the calculations. Thus we decided to use only
subintervals of length n > 50 (to be more precise: n = 64, 128, 256, ..., since the length of our
samples was a power of 2). On the other hand, the DFA statistic behaves nicely even for small
subintervals. So, we calculated $H_{DFA}$ for n > 10 (more precisely: n = 16, 32, 64, ...) and - to be
consistent with the results for the rescaled range analysis - separately for n > 50.

As we have mentioned in the previous Section, results of the Geweke-Porter-Hudak method
depend on the choice of the cutoff value $K$, which determines how many of the low Fourier
frequencies $\omega_k$ are taken into account. In our simulations we decided to use the standard value, i.e.
$K = [L^{0.5}]$, because lower powers of $L$ introduce larger estimation errors, while larger powers force
us to move away from the region for which the theoretical results hold.

The large number of simulated trajectories allowed us to compare the methods. For a given
method we obtained 10000 estimated values of $H$, called $\{H_i, i = 1, 2, ..., 10000\}$. Apart from
calculating their mean (see Fig. 2), we computed their standard deviation and mean absolute
error, i.e.

$$\frac{1}{10000} \sum_{i=1}^{10000} |H_i - 0.5|,$$

which provides some information on the bias. The results are presented in Table 1. To have a
complete picture, in Fig. 3 we also plotted the 2.5% and 97.5% sample quantiles for all methods.
A p% sample quantile, denoted by $x_p$, is such a value that p% of the sample observations are less
than $x_p$. Equivalently we can say that the empirical distribution evaluated at $x_p$ equals p%,
i.e. $F_e(x_p) = p\%$.
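
The summary measures used in this comparison are straightforward to compute from a list of estimates. The sketch below is illustrative only: the name `summarize` is hypothetical, the standard deviation uses the n - 1 convention (an assumption, as the text does not state which one was used), and the sample quantile is taken as the smallest order statistic at which the empirical distribution reaches p.

```python
import math
import statistics

def summarize(estimates, true_h=0.5):
    """Mean, standard deviation, mean absolute error and 2.5%/97.5% sample quantiles."""
    n = len(estimates)
    ordered = sorted(estimates)

    def quantile(p):
        # smallest order statistic whose empirical distribution value is at least p
        return ordered[max(math.ceil(p * n) - 1, 0)]

    return {
        "mean": sum(estimates) / n,
        "sd": statistics.stdev(estimates),               # n - 1 convention (assumed)
        "mae": sum(abs(h - true_h) for h in estimates) / n,
        "q2.5": quantile(0.025),
        "q97.5": quantile(0.975),
    }

# toy input: an evenly spaced grid of "estimates", used only to exercise the function
grid = [i / 100 for i in range(1, 101)]
stats = summarize(grid)
```

In the setting of this section `estimates` would hold the 10000 values $H_i$ obtained for one method and one sample length.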

In all three tests the clear winner for L > 500 is the DFA statistic with n > 50, as it gives the
estimated values closest to the initial Hurst exponent (H = 0.5). The reason it performs worse
than the DFA statistic with n > 10 for the smallest tested samples is probably the fact that
L = 256 has only three divisors greater than 50: n = 64, 128, 256 and that regression based on
only three points can yield large errors.







Figure 3: 2.5% and 97.5% sample quantiles (over 10000 samples) of the estimated Hurst exponents
for Gaussian white noise sequences of length $L = 256, 512, ..., 65536$, plotted against sample size.
The methods used are the Anis-Lloyd corrected R/S statistic (R/S-AL) for n > 50, the Detrended
Fluctuation Analysis for n > 10 and n > 50, and the periodogram Geweke-Porter-Hudak method for
cutoff $K = [L^{0.5}]$.



Table 1: Estimation results for H = 0.5 using 10000 independent realizations of different length.
The best result in each row is marked with an asterisk.

Sample size    R/S-AL (n > 50)    DFA (n > 10)    DFA (n > 50)    GPH

Standard deviation
256            0.1739             0.0972*         0.1222          0.2205
512            0.1043             0.0724          0.0716*         0.1705
1024           0.0739             0.0559          0.0497*         0.1401
2048           0.0535             0.0436          0.0355*         0.1085
4096           0.0422             0.0359          0.0278*         0.0898
8192           0.0332             0.0293          0.0217*         0.0727
16384          0.0277             0.0254          0.0181*         0.0622
32768          0.0237             0.0223          0.0154*         0.0511
65536          0.0193             0.0184          0.0125*         0.0424

Mean absolute error
256            0.1396             0.0800*         0.1003          0.1733
512            0.0838             0.0589          0.0582*         0.1357
1024           0.0594             0.0455          0.0401*         0.1108
2048           0.0428             0.0355          0.0287*         0.0861
4096           0.0337             0.0291          0.0224*         0.0711
8192           0.0266             0.0240          0.0176*         0.0580
16384          0.0224             0.0209          0.0147*         0.0494
32768          0.0189             0.0181          0.0124*         0.0402
65536          0.0155             0.0150          0.0101*         0.0335

Unfortunately, no asymptotic distribution theory has been derived for the DFA statistic so far.
However, using Monte Carlo simulations we were able to construct empirical (i.e. approximate)
confidence intervals for all three analyzed methods.

4 Construction of confidence intervals 

The estimation of the Hurst exponent $H$ alone is not enough. We also need a measure of the
significance of the results. Traditionally, the statistical approach is to test the null hypothesis of
no or weak dependence versus the alternative of strong dependence or long memory at some given
significance level. However, to construct a test the asymptotic distribution of the test statistic
must be known. Of the three analyzed methods only the spectral one has well-known asymptotic
properties. In this Section we will thus construct empirical confidence intervals.

The procedure is quite simple, although burdensome, and consists of the following: 1° for a
set of sample lengths (in our case: $L = 256, 512, ..., 65536$) generate a large number (here: 10000)
of realizations of an independent or a weakly dependent time series (here: Gaussian white noise);
2° compute the lower (0.5%, 2.5%, 5%) and upper (95%, 97.5%, 99.5%) sample quantiles for all
sample lengths; and 3° plot the sample quantiles vs. sample size and fit them with some functions.
These functions can later be used to construct confidence intervals. The 5% and 95% quantiles
designate the 90% (two-sided) confidence interval, the 2.5% and 97.5% quantiles the 95% confidence
interval, and the 0.5% and 99.5% quantiles the 99% confidence interval.
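
Step 3° amounts to a least squares fit on transformed quantiles. The sketch below illustrates it for one possible transform, $\log|x_p - 0.5|$ against $\log N$ with $N = \log_2 L$. The function names are hypothetical, and the input quantiles are synthetic: they are generated from the 95% lower bound reported later in Table 3 for DFA (n > 10) purely to check that the fit recovers known coefficients, and are not simulation output.

```python
import math

def fit_log_linear(ns, quantiles, lower=True):
    """Fit log|q - 0.5| = slope * log N + intercept by ordinary least squares."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(0.5 - q) if lower else math.log(q - 0.5) for q in quantiles]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return slope, my - slope * mx

def bound(n, slope, intercept, lower=True):
    """Evaluate the fitted bound 0.5 -/+ exp(slope * log N + intercept)."""
    width = math.exp(slope * math.log(n) + intercept)
    return 0.5 - width if lower else 0.5 + width

ns = list(range(8, 17))                                   # N = log2(L) for L = 256, ..., 65536
synthetic = [0.5 - math.exp(-2.33 * math.log(n) + 3.25) for n in ns]   # synthetic, not simulated
slope, intercept = fit_log_linear(ns, synthetic)
```

With real simulation output the list `synthetic` would be replaced by the sample quantiles from step 2°, and the regressor would be changed to whichever transform of $N$ gives a satisfactory linear fit for the method at hand.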

Results of the above procedure for the Anis-Lloyd corrected R/S statistic are presented in
Table 2 and Figs. 4-5. To find the 95% confidence interval we plotted 2.5% and 97.5% sample
quantiles vs. sample size. The only satisfactory results were obtained for $\log(x_{97.5} - 0.5)$ and
$\log(0.5 - x_{2.5})$ vs. $\log(\log N)$, where $N = \log_2 L$, see Fig. 4. For sample size L = 256 the 97.5%
quantile introduced large estimation errors, probably due to the fact that L = 256 has only three
divisors greater than 50: n = 64, 128, 256 and that regression based on only three points can yield
large errors. Thus we decided to use 97.5% quantiles (in fact 95% and 99.5% quantiles as well)
coming only from samples of at least 512 observations. The obtained fit was very good: the $R^2$
statistic for the lower quantile was 0.9987 and for the upper 0.9972. The complete formulas,
also for 90% and 99% confidence intervals, are given in Table 2. In Figure 5 we plotted the mean
value of $H$ and the 95% confidence interval vs. sample size. For comparison we also added the
heuristic "significance level" (one over the square root of sample length) given in Peters. It
is clearly seen that this "significance level" rejects the null hypothesis too often compared to the
95% confidence interval (in fact too often even compared to the 90% confidence interval).

Table 2: Empirical confidence intervals for the Anis-Lloyd corrected R/S statistic and sample
length $L = 2^N$.

         Confidence intervals for n > 50
Level    Lower bound                              Upper bound
90%      0.5 - exp(-7.35 · log(log N) + 4.06)     exp(-7.07 · log(log N) + 3.75) + 0.5
95%      0.5 - exp(-7.33 · log(log N) + 4.21)     exp(-7.20 · log(log N) + 4.04) + 0.5
99%      0.5 - exp(-7.19 · log(log N) + 4.34)     exp(-7.51 · log(log N) + 4.58) + 0.5


Results for the DFA statistic are presented in Table 3 and Figs. 6-9. Again, in order to find
the 95% confidence interval we plotted 2.5% and 97.5% sample quantiles vs. sample size. The
only satisfactory results were obtained for $\log(x_{97.5} - 0.5)$ and $\log(0.5 - x_{2.5})$ against $\log N$,
where $N = \log_2 L$, see Figs. 6 and 8. As in the case of the R/S-AL statistic, for sample size L = 256
the 97.5% quantile introduced large estimation errors when the subinterval size was restricted to
n > 50, see Fig. 8. Thus we decided to use 97.5% quantiles (95% and 99.5% quantiles as well)
coming only from samples of at least 512 observations when the higher (n > 50) cutoff was applied.
The obtained fit was very good. In the n > 10 case the $R^2$ statistic for the lower quantile was
0.9985 and for the upper 0.9972. In the n > 50 case the $R^2$ statistic for the lower quantile was
0.9973 and for the upper 0.9918. The complete formulas, also for 90% and 99% confidence intervals,
are given in Table 3. In Figures 7 and 9 we plotted the mean value of $H$ and the 95% confidence
interval vs. sample size.

Figure 4: Fitting R/S-AL quantiles (n > 50): a plot of $\log(x_{97.5} - 0.5)$ and $\log(0.5 - x_{2.5})$ vs.
$\log(\log N)$, where $N = \log_2 L$, together with the least squares linear fit. For the smallest analyzed
samples (L = 256) the 97.5% quantile introduced large estimation errors and was not used in the
analysis.

Figure 5: Mean value of the Anis-Lloyd corrected R/S statistic, 95% empirical confidence intervals
and $1/\sqrt{L}$ intervals for samples of size $L = 256, 512, ..., 65536$.



Table 3: Empirical confidence intervals for the DFA statistic and sample length $L = 2^N$.

         Confidence intervals for n > 10
Level    Lower bound                         Upper bound
90%      0.5 - exp(-2.33 · log N + 3.09)     exp(-2.44 · log N + 3.13) + 0.5
95%      0.5 - exp(-2.33 · log N + 3.25)     exp(-2.46 · log N + 3.38) + 0.5
99%      0.5 - exp(-2.20 · log N + 3.18)     exp(-2.45 · log N + 3.62) + 0.5

         Confidence intervals for n > 50
Level    Lower bound                         Upper bound
90%      0.5 - exp(-2.99 · log N + 4.45)     exp(-3.09 · log N + 4.57) + 0.5
95%      0.5 - exp(-2.93 · log N + 4.45)     exp(-3.10 · log N + 4.77) + 0.5
99%      0.5 - exp(-2.67 · log N + 4.06)     exp(-3.19 · log N + 5.28) + 0.5



Finally, results for the GPH statistic are presented in Table 4 and Figs. 10-11. The only
satisfactory results for the 95% confidence interval were obtained by plotting $\log(x_{97.5} - 0.5)$ and
$\log(0.5 - x_{2.5})$ against $N^{2/3}$, where $N = \log_2 L$, see Fig. 10. The power 2/3 was chosen arbitrarily;
however, comparably good results were possible in the range (0.6, 0.7). The obtained fit was very
good: the $R^2$ statistic for the lower quantile was 0.9953 and for the upper 0.9987. The complete
formulas, also for 90% and 99% confidence intervals, are given in Table 4. In Figure 11 we plotted
the mean value of $H$ and the 95% confidence interval vs. sample size. For comparison we also
added the theoretical 95% confidence intervals, see formula (3). The empirical and theoretical
intervals are quite close to each other, especially for large samples. For small samples the GPH
statistic has a slight negative bias and the empirical values are shifted downward. Recall, however,
that the theoretical results are derived from the limiting behavior and obviously cannot take into
account finite sample properties.

The small difference between the empirical and theoretical confidence intervals justifies our 
approach and permits us to use in practical applications the empirical confidence intervals given 
in Tables 2-4. 

Table 4: Empirical confidence intervals for the GPH statistic and sample length $L = 2^N$.

         Confidence intervals for K = [L^0.5]
Level    Lower bound                          Upper bound
90%      0.5 - exp(-0.71 · N^(2/3) + 1.87)    exp(-0.68 · N^(2/3) + 1.62) + 0.5
95%      0.5 - exp(-0.71 · N^(2/3) + 2.04)    exp(-0.68 · N^(2/3) + 1.78) + 0.5
99%      0.5 - exp(-0.73 · N^(2/3) + 2.45)    exp(-0.65 · N^(2/3) + 1.92) + 0.5



5 Applications 

Having calculated the empirical confidence intervals we are now ready to test for long range
dependence in financial time series. For the analysis we selected four data sets (two stock indices
and two power market benchmarks):

• 2526 daily returns of the Dow Jones Industrial Average (DJIA) index for the period Jan. 
2nd, 1990 - Dec. 30th, 1999; 

• 1560 daily returns of the WIG20 Warsaw Stock Exchange index (based on 20 blue chip stocks
from the Polish capital market) for the period Jan. 2nd, 1995 - Mar. 30th, 2001;

• 728 daily returns of electricity traded in the California Power Exchange (CalPX) spot market
for the period April 1st, 1998 - March 29th, 2000;

• 690 daily returns of firm on-peak power (OTC spot market) in the Entergy region (Louisiana,
Arkansas, Mississippi and East Texas) for the period January 2nd, 1998 - September 25th,
2000.

Figure 6: Fitting DFA quantiles (n > 10): a plot of $\log(x_{97.5} - 0.5)$ and $\log(0.5 - x_{2.5})$ vs. $\log N$,
where $N = \log_2 L$.

Figure 7: Mean value of the DFA statistic (n > 10) and 95% confidence intervals for samples of
size $L = 256, 512, ..., 65536$.

Figure 8: Fitting DFA quantiles (n > 50): a plot of $\log(x_{97.5} - 0.5)$ and $\log(0.5 - x_{2.5})$ vs. $\log N$,
where $N = \log_2 L$. For the smallest analyzed samples (L = 256) the 97.5% quantile introduced
large estimation errors and was not used in the analysis.

Figure 9: Mean value of the DFA statistic (n > 50) and 95% confidence intervals for samples of
size $L = 256, 512, ..., 65536$.

Figure 10: Fitting GPH quantiles: a plot of $\log(x_{97.5} - 0.5)$ and $\log(0.5 - x_{2.5})$ vs. $N^{2/3}$, where
$N = \log_2 L$. For comparison the theoretical quantiles are also plotted.

Figure 11: Mean value of the GPH statistic and 95% confidence intervals for samples of size
$L = 256, 512, ..., 65536$. For comparison the theoretical quantiles are also plotted. The small
difference between the empirical and theoretical confidence intervals justifies our approach.

Table 5: Estimates of the Hurst exponent H for financial data. *, ** and *** denote significance
at the (two-sided) 90%, 95% and 99% level, respectively.

Data                               R/S-AL       DFA          GPH

Stock indices
DJIA returns                       0.4585       0.4195**     0.3560
DJIA absolute value of returns     0.7838***    0.9080***    0.8357***
WIG20 returns                      0.5030       0.4981       0.4604
WIG20 absolute value of returns    0.9103***    0.9494***    0.8262***

Electricity price benchmarks
CalPX returns                      0.3473*      0.2633***    0.0667***
Entergy returns                    0.2995**     0.3651**     0.0218***

The results of the Anis-Lloyd corrected R/S analysis (for n > 50), the Detrended Fluctuation
Analysis (for n > 50 only) and the periodogram Geweke-Porter-Hudak method (for $K = [L^{0.5}]$)
for these time series are summarized in Table 5. The significance of the results is based on values
obtained from Tables 2-4. For example, the 90% confidence intervals of the R/S-AL statistic for
DJIA returns ($N = \log_2 2526 = 11.3026$) were calculated as follows:

lower bound: 0.5 - exp(-7.35 · log(log N) + 4.06) = 0.4138,
upper bound: exp(-7.07 · log(log N) + 3.75) + 0.5 = 0.5810.

Similarly, we obtained confidence intervals for the 95% and 99% two-sided levels and other methods. 
All obtained values are presented in Table 6. For comparison we included the theoretical confidence 
intervals for the periodogram Geweke-Porter-Hudak method. The differences between the empirical 
(obtained from Table 4) and theoretical confidence intervals are small and the significance of the 
results is the same for both sets of values. 
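
The worked example for the R/S-AL statistic can be reproduced with a few lines of Python. This is an illustrative sketch: the names `RS_AL` and `rs_al_interval` are hypothetical, the coefficients are those read from Table 2, and natural logarithms are assumed for the outer `log` terms because that choice reproduces the values quoted in the text.

```python
import math

# (slope, intercept) pairs for the lower and upper bounds, as read from Table 2 (n > 50)
RS_AL = {
    90: ((-7.35, 4.06), (-7.07, 3.75)),
    95: ((-7.33, 4.21), (-7.20, 4.04)),
    99: ((-7.19, 4.34), (-7.51, 4.58)),
}

def rs_al_interval(length, level=90):
    """Empirical confidence interval for the R/S-AL estimate of H at a given sample length."""
    n = math.log2(length)                  # N = log2(L)
    u = math.log(math.log(n))              # natural logarithms (assumed)
    (a_lo, b_lo), (a_up, b_up) = RS_AL[level]
    return 0.5 - math.exp(a_lo * u + b_lo), 0.5 + math.exp(a_up * u + b_up)

lower, upper = rs_al_interval(2526, 90)    # 2526 DJIA daily returns
```

For 2526 observations this gives approximately (0.4138, 0.5810) at the 90% level, matching the values computed in the example; an estimate of $H$ falling outside the interval is deemed significant at that level.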



Table 6: Confidence intervals for 2526 DJIA returns.

Level    R/S-AL              DFA                 GPH                 GPH (theoretical)
90%      (0.4138, 0.5810)    (0.4392, 0.5538)    (0.3184, 0.6645)    (0.3305, 0.6695)
95%      (0.3980, 0.5965)    (0.4297, 0.5641)    (0.2847, 0.6931)    (0.2981, 0.7019)
99%      (0.3686, 0.6258)    (0.4106, 0.5858)    (0.2067, 0.7583)    (0.2346, 0.7654)



As expected, we found (almost) no evidence for long range dependence in the
stock indices returns and strong - i.e. significant at the two-sided 99% level for all three methods
- dependence in the stock indices volatility (more precisely: in absolute value of stock indices
returns). On the other hand, electricity price returns were found to exhibit a mean-reverting
mechanism (the Hurst exponent was found to be significantly smaller than 0.5), which is consistent
with our earlier findings.